Abstract

This paper presents a control architecture in which a direct adaptive control technique
is used within the model predictive control framework, using the concurrent
learning based approach, to compensate for model uncertainties. At each time step,
the control sequences and the parameter estimates are both used as the optimization
arguments, thereby removing the need for switching between the learning phase
and the control phase, as is the case with hybrid-direct-indirect control architectures.
The state derivatives are approximated using pseudospectral methods, which are widely
used for numerical optimal control problems. Theoretical results and numerical
simulation examples are used to establish the effectiveness of the architecture.


1 Introduction 

Model predictive control (MPC) refers to a class of control systems in which the current
control action is obtained at each sampling instant by solving a finite (or infinite) horizon
open-loop optimal control problem, using the current state of the system as the initial
condition. While the result of the optimization is a sequence of control actions over the
prediction horizon, only the first control action is applied at the current time; the process
is repeated at the next time instant. This framework makes it straightforward
to cope with hard constraints on controls and states. As a result, MPC has received a
lot of attention in the literature for both discrete and continuous time systems.
However, because MPC depends on a dynamic predictive model,
unaccounted modeling errors and dynamic uncertainties may render the model obsolete
or inaccurate, in which case the performance of the MPC can no longer be guaranteed.
To overcome this challenge, a number of researchers have proposed indirect-adaptive
MPC approaches, which incorporate learning in the MPC framework.
Using these approaches, the system parameters are estimated online and
open-loop optimal controllers are generated at each time step. One major challenge of these
approaches, however, is that it is difficult to guarantee stability, especially during parameter
estimation transient phases [25].

On the other hand, direct adaptive control techniques modulate the system input to
compensate for modeling uncertainties. Direct adaptive control can guarantee stability,
even during harsh transients; however, it does not offer any long-term improvement due
to model learning unless the system states are persistently exciting¹. Furthermore, it is
difficult to generate optimal solutions in the presence of input and state constraints with
direct adaptive architectures [12].

1 A bounded vector signal $\Phi(t)$ is persistently exciting if for all $t > t_0$ there exist $T > 0$ and $\gamma > 0$ such
that $\int_t^{t+T} \Phi(\tau)\Phi(\tau)^T d\tau \geq \gamma I$.

In [12], a concurrent learning based approach was proposed to address the above
challenges. Concurrent learning (CL) [11, 13] uses recorded and current data concurrently to
learn the parametric uncertainties in a dynamic system. Although it was first introduced
for adaptation in the framework of Model Reference Adaptive Control (MRAC) [11], it
can, as a result of the form of its update law, be easily extended to the general framework
of adaptive control with linear-in-the-parameter (LP) structure. It was shown in [11] that,
provided the recorded data satisfy a certain rank condition, the adaptive weight
convergence can occur without the system states being persistently exciting.

In this paper, a direct adaptive technique is used within the MPC framework, in
conjunction with the concurrent learning based approach, to compensate for model
uncertainties. At each time step, the control sequences and the parameter estimates are both
used as the optimization arguments, thereby removing the need for switching
between the learning phase and the control phase, as is the case with hybrid-direct-indirect
control architectures. Moreover, the state derivatives are
approximated at the recorded data points and over the prediction horizon using a pseudospectral
method. Pseudospectral methods are widely used in the numerical solution of optimal
control problems [22]. They belong to a class of direct collocation methods
where the optimal control problem is transcribed to a nonlinear programming problem
(NLP) by parameterizing the state and control using global polynomials, and collocating
the differential-algebraic equations using nodes obtained from a Gaussian quadrature.
With this approach, it is easier to formulate the problem without requiring that the
system dynamics be linearly parameterizable.

The rest of the paper is organized as follows: In Section 2, the notation used throughout
the paper is introduced. In Section 3, the CL problem is reformulated as an
optimization problem to facilitate its inclusion into the MPC framework. In Section 4, the problem
setup for the concurrent learning model predictive control is given. In Section 5, the
pseudospectral implementation is presented. Numerical examples are given in Section 6.
The conclusion follows in Section 7.


2 Notation 


Throughout the paper, the following notation is used: $\mathbb{R}$ and $\mathbb{R}_+$ denote the sets of
real numbers and positive real numbers, respectively. All vectors and vector functions
are treated as row vectors; that is, $x(\tau) = [x_1(\tau), \ldots, x_n(\tau)] \in \mathbb{R}^n$, where $n$ is the
continuous time dimension of $x(\tau)$. The Euclidean norm of a vector $x \in \mathbb{R}^n$ is denoted by
$\|x\| = (x^T x)^{1/2}$. The quadratic form $\|x\|_P^2 = x^T P x$ is defined for any symmetric positive
semi-definite matrix $P$. The expression $P \preceq Q$ means that the matrix $P - Q$ is negative
semi-definite. The transpose of a matrix $B$ is denoted by $B^T$. The $i$th row of a matrix $D$
is denoted by $D_i$. The gradient of a scalar valued function $f(X)$ is a row vector denoted
by $\nabla f(X)$. The matrix $f_\theta(\ldots, \theta_{\mathcal{H}}, \ldots) \in \mathbb{R}^{p \times n}$ denotes the partial derivative of a vector
valued function $f(\ldots, \theta, \ldots) : \ldots \times \mathbb{R}^p \times \ldots \to \mathbb{R}^n$ with respect to the argument $\theta \in \mathbb{R}^p$,
evaluated at $\theta = \theta_{\mathcal{H}}$. The $\mathcal{L}_2$ norm of a vector-valued signal $x : \mathbb{R}_+ \to \mathbb{R}^n$ is given by

$$\|x\|_{\mathcal{L}_2} = \left( \int_0^\infty \|x(\tau)\|^2 \, d\tau \right)^{1/2}.$$

$\mathcal{H}^\alpha_n$ denotes the $n$-vector valued Sobolev space over the interval $[-1, 1]$, with $\alpha$ denoting
the number of classical derivatives of its elements. The norm on $\mathcal{H}^\alpha$ is given with respect to the $\mathcal{L}_2$
norm as

$$\|x\|_{(\alpha)} = \left( \sum_{k=0}^{\alpha} \left\| x^{(k)} \right\|_{\mathcal{L}_2}^2 \right)^{1/2}.$$

The space of all bounded functions is denoted by $\mathcal{L}_\infty$.


3 Concurrent Learning 

In this section, the original CL problem is reformulated as an optimization problem.
This facilitates a direct inclusion into the MPC framework, as will be shown in the next
section. The class of systems considered is described by the following set of nonlinear
ordinary differential equations:

$$\dot{x}(t) = f(x(t), u(t), \theta), \quad x(0) = x_0, \tag{1}$$

where $x(t) \in \mathbb{R}^n$ is the vector of state variables, $u(t) \in \mathbb{R}^m$ is a vector of inputs, and
$\theta \in \Theta \subset \mathbb{R}^p$ is a vector of unknown constant parameters. The following assumptions are
made for the system described in (1) (see also [10] for a similar set of assumptions):

(A1). $f : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R}^n$ is twice continuously differentiable and $f(0, 0, \theta) = 0$, $\forall \theta \in \Theta$.
That is, $0 \in \mathbb{R}^n$ is an equilibrium of the system with $u = 0$.

(A2). $u(t) \in \mathcal{U}$, where $\mathcal{U} \subset \mathbb{R}^m$ is compact, convex, and $0 \in \mathbb{R}^m$ is contained in the interior
of $\mathcal{U}$.

(A3). The system in (1) has a unique solution for any initial condition $x_0 \in \mathbb{R}^n$ and any
piecewise continuous and right continuous $u(\cdot) : [0, \infty) \to \mathcal{U}$, for all $\theta \in \Theta$.

Let

$$\mathcal{H} = \{ x_{\mathcal{H}} : \dot{x}_{\mathcal{H}} - f(x_{\mathcal{H}}, u_{\mathcal{H}}, \theta) = 0 \} \tag{2}$$

be a set of recorded data generated by the system in (1) from a given open-loop input
sequence $u_{\mathcal{H}}(\tau_{\mathcal{H}})$, $\tau_{\mathcal{H}} \in [0, T]$, and unknown constant parameter $\theta$.

The following definition of persistence of excitation is adopted for the subsequent
development in this paper.

Definition 1 (Persistence of Excitation (PE)). The system in (1) is said to be persistently
exciting with respect to the open loop input sequence $u_{\mathcal{H}}(t)$, if there exist $\lambda_1, \lambda_2 > 0$ such
that

$$\lambda_1 I \preceq \int_0^T f_\theta(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau), \theta_{\mathcal{H}}) \, f_\theta(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau), \theta_{\mathcal{H}})^T \, d\tau \preceq \lambda_2 I, \tag{3}$$

for all $\theta_{\mathcal{H}} \in \Theta$.
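
As a hedged illustration of Definition 1, the sketch below approximates the integral in (3) by a Riemann sum for a hypothetical system with $p = 2$, $n = 1$, and regressor $f_\theta = [x; u]$, then returns the extreme eigenvalues, which play the role of $\lambda_1$ and $\lambda_2$. The signals, the horizon, and all function names are assumptions chosen for illustration; they do not come from the paper.

```python
import math

def gram_matrix(regressor, horizon, steps=2000):
    """Riemann-sum approximation of int_0^T f_theta f_theta^T dtau for p = 2, n = 1."""
    dt = horizon / steps
    g11 = g12 = g22 = 0.0
    for k in range(steps):
        a, b = regressor(k * dt)  # the two rows of f_theta at time k * dt
        g11 += a * a * dt
        g12 += a * b * dt
        g22 += b * b * dt
    return g11, g12, g22

def extreme_eigenvalues(g11, g12, g22):
    """Closed-form smallest and largest eigenvalues of a symmetric 2x2 matrix."""
    mean = 0.5 * (g11 + g22)
    radius = math.hypot(0.5 * (g11 - g22), g12)
    return mean - radius, mean + radius

# Hypothetical recorded signals: x_H(tau) = sin(tau), u_H(tau) = cos(2 tau).
rich = lambda tau: (math.sin(tau), math.cos(2.0 * tau))
# Degenerate recording: the two regressor rows are proportional.
poor = lambda tau: (math.sin(tau), 2.0 * math.sin(tau))

lam1_rich, lam2_rich = extreme_eigenvalues(*gram_matrix(rich, 2.0 * math.pi))
lam1_poor, lam2_poor = extreme_eigenvalues(*gram_matrix(poor, 2.0 * math.pi))
```

In this sketch the first recording gives a strictly positive lower eigenvalue, so a bound of the form (3) holds, whereas the second recording gives a lower eigenvalue of (numerically) zero and therefore fails the condition.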


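Let $\hat{\theta}$ be an estimate of the unknown parameter $\theta$. The performance index

$$e(\hat{\theta}(t)) = \int_0^T \left\| \dot{x}_{\mathcal{H}}(\tau_{\mathcal{H}}) - f(x_{\mathcal{H}}(\tau_{\mathcal{H}}), u_{\mathcal{H}}(\tau_{\mathcal{H}}), \hat{\theta}(t)) \right\|^2 d\tau_{\mathcal{H}} \tag{4}$$

is defined to characterize the "goodness"² of the parameter estimate $\hat{\theta}$. Next, the
relationship between $e(\hat{\theta})$ and the parameter estimation error is exploited.

2 This term is used to describe how close the response, generated using the estimate, is to the actual
recorded data.

Theorem 1. Suppose the system in (1) is persistently exciting with respect to the open-loop
input sequence $u_{\mathcal{H}}(t)$; then for all $\epsilon > 0$, there exists $\delta > 0$, satisfying $\delta \to 0$ as $\epsilon \to 0$,
such that

$$\left\| \theta - \hat{\theta} \right\| < \delta \quad \text{whenever} \quad e(\hat{\theta}) < \epsilon.$$

Proof. After using (2), the equation in (4) can be written as

$$e(\hat{\theta}) = \int_0^T \left\| f(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau), \theta) - f(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau), \hat{\theta}) \right\|^2 d\tau. \tag{5}$$

Using the Mean Value Theorem,

$$f(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau), \theta) - f(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau), \hat{\theta}) = (\theta - \hat{\theta}) \, f_\theta(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau), \theta_{\mathcal{H}}), \tag{6}$$

where

$$\theta_{\mathcal{H}} = \alpha \theta + (1 - \alpha) \hat{\theta}, \quad \alpha \in [0, 1].$$

Thus, the equation in (5) becomes

$$e(\hat{\theta}) = (\theta - \hat{\theta}) \left( \int_0^T f_\theta(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau), \theta_{\mathcal{H}}) \, f_\theta(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau), \theta_{\mathcal{H}})^T d\tau \right) (\theta - \hat{\theta})^T. \tag{7}$$

Now, using (3), it follows that

$$\lambda_1 \left\| \theta - \hat{\theta} \right\|^2 \leq e(\hat{\theta}) \leq \lambda_2 \left\| \theta - \hat{\theta} \right\|^2. \tag{8}$$

Thus, the conclusion follows by setting

$$\delta = \sqrt{\frac{\epsilon}{\lambda_1}}.$$

□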

Remark 1. The model error given by the performance index in (4) requires the
computation of the state derivatives. These can be computed accurately using numerical
smoothing techniques. However, as will be shown in subsequent sections, the need to
compute state derivatives is removed by transforming the problem using pseudospectral
approximation.

Remark 2. Theorem 1 shows that the smaller the value of the performance index in (4),
the smaller the 2-norm of the parameter estimation error. Thus, the parameter estimation
error can be reduced as much as possible by setting

$$\hat{\theta} = \arg\min_{\theta'} e(\theta').$$
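
The following sketch illustrates Remark 2 under stated assumptions: a hypothetical scalar system $\dot{x} = -\theta x + u$ with true parameter $\theta = 2$, recorded samples whose derivatives are taken directly from the model (so they satisfy (2) exactly), and a golden-section search standing in for whatever optimizer is actually used. None of these choices come from the paper; the point is only that minimizing a discretized version of (4) recovers the parameter.

```python
import math

def f(x, u, theta):
    """Hypothetical scalar dynamics x_dot = -theta * x + u."""
    return -theta * x + u

def record_data(theta_true, horizon=5.0, steps=500):
    """Euler rollout that stores (x, u, x_dot) samples; x_dot comes from the model."""
    dt, x, data = horizon / steps, 1.0, []
    for k in range(steps):
        u = math.sin(k * dt)
        x_dot = f(x, u, theta_true)
        data.append((x, u, x_dot))
        x += dt * x_dot
    return data, dt

def model_error(theta_hat, data, dt):
    """Discretized performance index e(theta_hat) in the spirit of (4)."""
    return sum((x_dot - f(x, u, theta_hat)) ** 2 for x, u, x_dot in data) * dt

def golden_section(cost, lo, hi, tol=1e-8):
    """Minimize a unimodal scalar function on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if cost(a) < cost(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

data, dt = record_data(theta_true=2.0)
theta_hat = golden_section(lambda th: model_error(th, data, dt), 0.0, 5.0)
```

For this linear-in-the-parameter example the index is quadratic in the estimate, so a one-dimensional search is sufficient; a general nonlinear parameterization would need a multivariate NLP solver.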

4 Concurrent Learning Adaptive Model Predictive Control Scheme

In this section, the problem setup for the concurrent learning adaptive model predictive
control is given. Following each measurement, an open-loop optimal control problem is solved.
The objective function to minimize comprises the performance index given in (4) and
an additional cost functional which penalizes the state and control in accordance with
the standard MPC setup. Moreover, the argument of the optimization is the pair $(u(\cdot), \hat{\theta})$.
In other words, at each time step, the open-loop control sequence $u(\cdot)$ and
a constant parameter estimate $\hat{\theta}$ that minimize the combined cost functional are found.
In particular, the open-loop optimal control problem at time $t$, with initial state $x(t)$, is
formulated as

$$\min_{(u(\cdot), \hat{\theta})} J(x(t), u(\cdot), \hat{\theta}), \tag{9}$$

where

$$J(x(t), u(\cdot), \hat{\theta}) = \int_t^\infty \left( \|x(\tau, x(t))\|_Q^2 + \|u(\tau)\|_R^2 \right) d\tau + \gamma e(\hat{\theta}), \tag{10}$$

subject to

$$\dot{x} = f(x, u, \hat{\theta}), \quad x(t, x(t)) = x(t), \tag{11a}$$

$$u(\tau) \in \mathcal{U}, \quad \tau \in [t, \infty), \tag{11b}$$

where $\gamma > 0$, and $Q \in \mathbb{R}^{n \times n}$ and $R \in \mathbb{R}^{m \times m}$ are positive definite symmetric weighting
matrices; $x(\tau, x(t))$ is the state trajectory of the system in (11a), starting from the initial
state $x(t)$, and driven by the open-loop control sequence $u(\tau)$, $\tau \in [t, \infty)$. Without loss
of generality, an infinite-horizon nonlinear model predictive control problem is considered.
For a finite-horizon³ case, the problem can be set up to include an additional quadratic
terminal cost chosen to ensure that closed-loop asymptotic stability is guaranteed.

According to the receding horizon philosophy, the resulting open-loop optimal control
profile is applied to the system only until the next measurement becomes available. Let
$T_s$ be the measurement sampling time, and $(u^*(\tau, x(t)), \hat{\theta}^*(x(t)))$ the optimal solution to
the optimization problem (9)-(11b); then the closed-loop control and parameter estimate
are given by

$$u(\tau) = u^*(\tau, x(t)), \quad \tau \in [t, t + T_s], \tag{12}$$

$$\hat{\theta}(\tau) = \hat{\theta}^*(x(t)) + k_\theta^{-1} \int_t^\tau \int_0^T \left( \dot{x}_{\mathcal{H}} - f(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau_{\mathcal{H}}), \hat{\theta}(t)) \right) \Gamma(\tau_{\mathcal{H}})^T \, d\tau_{\mathcal{H}} \, d\sigma, \tag{13}$$

within the time interval $\tau \in (t, t + T_s)$, with $\hat{\theta} = \hat{\theta}^*(x(t))$ for $\tau \in \{0, T_s, 2T_s, \ldots\}$. Moreover,
$\Gamma : \mathbb{R}_+ \to \mathbb{R}^{n \times p}$ is chosen to satisfy

$$\lambda_1 I \preceq \int_0^T f_\theta(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau_{\mathcal{H}}), \theta_{\mathcal{H}}) \, \Gamma(\tau_{\mathcal{H}})^T \, d\tau_{\mathcal{H}} \preceq \lambda_2 I, \tag{14}$$

for all $\theta_{\mathcal{H}} \in \Theta$, and $k_\theta$ is a positive constant, with $k_\theta^{-1}$ being the concurrent learning gain.
It should be noted that the optimal parameter $\hat{\theta}^*$ is merely a decision variable internal to
the optimization problem in (9)-(11b). The update law in (13) is given as a by-product
of the proposed method. The asymptotic convergence to the true parameter, using the
given update law, is shown subsequently. Knowledge of the true parameter in the system
can be used for several purposes as desired by the user, for instance for diagnostics or
as a means to switch between controllers. Once new measurements become available
(after $T_s$ time units), the optimization problem in (9)-(11b) is solved again to find new
input profiles; the closed-loop control and parameter estimate in (12) and (13) are then
applied within the time interval $\tau \in [t + T_s, t + 2T_s]$, and so on. Note that, while the
parameter estimate at time $t$ is a function of the state measurement $x(t)$, it is treated as
a constant throughout the prediction window in the optimization problem in (9)-(11b).
This applies also to all other measurement points $t + kT_s$, $k = 1, 2, \ldots$. Consequently,
the closed-loop system is described by the ordinary differential equation

$$\dot{x}(t) = f(x(t), u(t), \hat{\theta}(t)). \tag{15}$$

Next, in the following subsection, the stability properties of the closed-loop system are
considered.

3 Interested readers are directed to references [10], [5], [25], [29].

4.1 Stability Analysis 

The following standard definitions, adapted from [26], describe the notion of stability as
used in this paper.

Definition 2 (Stability). The equilibrium point $x = 0$ of the system in (1) is stable if
for each $\epsilon > 0$ there exists $\eta(\epsilon) > 0$, such that $\|x(0)\| < \eta(\epsilon)$ implies that $\|x(t)\| < \epsilon$ for
all $t \geq 0$.

Definition 3 (Asymptotic Stability). The equilibrium point $x = 0$ of the system in (1)
is asymptotically stable if it is stable and $\eta$ can be chosen such that $\|x(0)\| < \eta$ implies
that $x(t) \to 0$ as $t \to \infty$.

Next, in order to facilitate the subsequent stability analysis, an important property of the
optimal value function is examined. For simplicity of exposition, except where required for
clarity, the shorthands

$$J(x(t)) \triangleq J(x(t), u, x, \hat{\theta}),$$

$$J^*(x(t)) \triangleq J(x(t), u^*, x^*, \hat{\theta}^*)$$

are used.

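Lemma 1. Suppose the system in (1) is persistently exciting with respect to the open-loop
input sequence $u_{\mathcal{H}}(t)$. If $k_\theta$ in (13) is chosen to satisfy the sufficient condition

$$k_\theta > \frac{T_s \lambda_3}{\lambda_4}, \tag{16}$$

where

$$\lambda_3 = \max\{\lambda_1^3, \lambda_1^2 \lambda_2, \lambda_1 \lambda_2^2, \lambda_2^3\}, \quad \lambda_4 = \min\{\lambda_1^2, \lambda_1 \lambda_2, \lambda_2^2\}$$

[$\lambda_1$ and $\lambda_2$ are given in (3) and (14)],
then the optimal value function $J(x(t), u^*(\tau), \hat{\theta}^*(\tau)) = J^*(x(t))$ satisfies

$$J^*(x(s)) \leq J^*(x(t)) - \int_t^s \left( \|x(\tau)\|_Q^2 + \|u^*(\tau)\|_R^2 + 2\beta \left\| \tilde{\theta}(\tau) \right\|^2 \right) d\tau, \tag{17}$$

for all $s \in (t, t + T_s]$, where

$$\beta = \frac{\gamma k_\theta \lambda_4}{2 (T_s \lambda_2 + k_\theta)^2} > 0, \tag{18}$$

and $\tilde{\theta}(t) = \theta - \hat{\theta}(t)$ is the parameter estimation error.

Proof. At time $t$, the optimal value function, using the closed-loop control in (12) and
parameter estimate in (13), is given by

$$J^*(x(t)) = \int_t^\infty \left( \|x^*(\tau, x(t))\|_Q^2 + \|u(\tau)\|_R^2 \right) d\tau + \gamma e(\hat{\theta}(t)). \tag{19}$$

Now, for all $s \in (t, t + T_s]$, the value of the objective cost functional in (10) is given as

$$J(x(s)) = \int_s^\infty \left( \|x^*(\tau, x(t))\|_Q^2 + \|u(\tau)\|_R^2 \right) d\tau + \gamma e(\hat{\theta}(s)) \tag{20}$$

$$= \int_t^\infty \left( \|x^*(\tau, x(t))\|_Q^2 + \|u(\tau)\|_R^2 \right) d\tau + \gamma e(\hat{\theta}(s)) - \int_t^s \left( \|x^*(\tau, x(t))\|_Q^2 + \|u(\tau)\|_R^2 \right) d\tau \tag{21}$$

$$= J^*(x(t)) - \int_t^s \left( \|x^*(\tau, x(t))\|_Q^2 + \|u(\tau)\|_R^2 \right) d\tau + \gamma \left( e(\hat{\theta}(s)) - e(\hat{\theta}(t)) \right). \tag{22}$$

For the sake of clarity, let

$$\Phi = \int_0^T f_\theta(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau_{\mathcal{H}}), \theta_{\mathcal{H}}) \, f_\theta(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau_{\mathcal{H}}), \theta_{\mathcal{H}})^T \, d\tau_{\mathcal{H}},$$

$$\Psi = \int_0^T f_\theta(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau_{\mathcal{H}}), \theta_{\mathcal{H}}) \, \Gamma(\tau_{\mathcal{H}})^T \, d\tau_{\mathcal{H}}.$$

Thus

$$e(\hat{\theta}(\tau)) = \tilde{\theta}(\tau) \Phi \tilde{\theta}(\tau)^T, \tag{23}$$

which implies that

$$e(\hat{\theta}(s)) - e(\hat{\theta}(t)) = \left( \tilde{\theta}(s) - \tilde{\theta}(t) \right) \Phi \left( \tilde{\theta}(s) - \tilde{\theta}(t) \right)^T + 2 \tilde{\theta}(t) \Phi \left( \tilde{\theta}(s) - \tilde{\theta}(t) \right)^T. \tag{24}$$

From (13), we have that

$$\hat{\theta}(s) - \hat{\theta}(t) = k_\theta^{-1} \int_t^s \int_0^T \left( \dot{x}_{\mathcal{H}} - f(x_{\mathcal{H}}, u_{\mathcal{H}}(\tau_{\mathcal{H}}), \hat{\theta}(t)) \right) \Gamma(\tau_{\mathcal{H}})^T \, d\tau_{\mathcal{H}} \, d\sigma,$$

which, after using the Mean Value Theorem and following an argument similar to that in (6) and
(7), yields

$$\tilde{\theta}(s) - \tilde{\theta}(t) = -k_\theta^{-1} \int_t^s \tilde{\theta}(t) \Psi \, d\tau = -(s - t) k_\theta^{-1} \tilde{\theta}(t) \Psi. \tag{25}$$

Now, by using the properties in (3) and (14), it is clear that

$$\lambda_1 I \preceq \Phi \preceq \lambda_2 I, \quad \lambda_1 I \preceq \Psi \preceq \lambda_2 I \quad \Longrightarrow \quad \tilde{\theta}(t) \Psi \Phi \Psi^T \tilde{\theta}(t)^T \leq \lambda_3 \left\| \tilde{\theta}(t) \right\|^2, \quad \lambda_4 \left\| \tilde{\theta}(t) \right\|^2 \leq \tilde{\theta}(t) \Phi \Psi^T \tilde{\theta}(t)^T.$$

Thus, (24) becomes

$$e(\hat{\theta}(s)) - e(\hat{\theta}(t)) = (s - t)^2 k_\theta^{-2} \tilde{\theta}(t) \Psi \Phi \Psi^T \tilde{\theta}(t)^T - 2 (s - t) k_\theta^{-1} \tilde{\theta}(t) \Phi \Psi^T \tilde{\theta}(t)^T$$

$$\leq (s - t)^2 k_\theta^{-2} \lambda_3 \left\| \tilde{\theta}(t) \right\|^2 - 2 (s - t) k_\theta^{-1} \lambda_4 \left\| \tilde{\theta}(t) \right\|^2$$

$$= (s - t) k_\theta^{-1} \left( (s - t) k_\theta^{-1} \lambda_3 - \lambda_4 \right) \left\| \tilde{\theta}(t) \right\|^2 - \int_t^s k_\theta^{-1} \lambda_4 \left\| \tilde{\theta}(t) \right\|^2 d\tau. \tag{26}$$

Now, since $(s - t) \leq T_s$,

$$k_\theta > \frac{T_s \lambda_3}{\lambda_4} \implies k_\theta > \frac{(s - t) \lambda_3}{\lambda_4} \implies (s - t) k_\theta^{-1} \lambda_3 - \lambda_4 < 0.$$

Thus

$$e(\hat{\theta}(s)) - e(\hat{\theta}(t)) \leq -\int_t^s k_\theta^{-1} \lambda_4 \left\| \tilde{\theta}(t) \right\|^2 d\tau. \tag{27}$$

Moreover, from (25) we have that

$$\tilde{\theta}(\tau) = \tilde{\theta}(t) \left( -(\tau - t) k_\theta^{-1} \Psi + I \right), \quad \tau \in (t, t + T_s], \tag{28}$$

which implies that

$$\left\| \tilde{\theta}(\tau) \right\|^2 \leq \left( T_s k_\theta^{-1} \lambda_2 + 1 \right)^2 \left\| \tilde{\theta}(t) \right\|^2. \tag{29}$$

Thus, the inequality in (27) yields

$$e(\hat{\theta}(s)) - e(\hat{\theta}(t)) \leq -\int_t^s \frac{2\beta}{\gamma} \left\| \tilde{\theta}(\tau) \right\|^2 d\tau. \tag{30}$$

Substituting (30) in (22) yields

$$J(x(s)) \leq J^*(x(t)) - \int_t^s \left( \|x(\tau)\|_Q^2 + \|u^*(\tau)\|_R^2 + 2\beta \left\| \tilde{\theta}(\tau) \right\|^2 \right) d\tau. \tag{31}$$

Finally, using the optimality of the value function at $s$, it follows that

$$J^*(x(s)) \leq J(x(s)), \tag{32}$$

which implies that

$$J^*(x(s)) \leq J^*(x(t)) - \int_t^s \left( \|x(\tau)\|_Q^2 + \|u^*(\tau)\|_R^2 + 2\beta \left\| \tilde{\theta}(\tau) \right\|^2 \right) d\tau.$$

□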
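
The decrease of the performance index established in the proof above can be checked numerically in a deliberately simple setting. The sketch below assumes a scalar parameter with $\Gamma = f_\theta$, so that $\Phi = \Psi = \lambda$, $\lambda_3 = \lambda^3$, and $\lambda_4 = \lambda^2$; the numerical values and helper names are hypothetical. It evaluates the scalar forms of (23) and (25) and compares the change in the index with the bound $-(s - t) k_\theta^{-1} \lambda_4 \tilde{\theta}(t)^2$ used in the proof.

```python
def error_after(theta_err_t, elapsed, k_theta, psi):
    """Scalar form of (25): theta_err(s) = theta_err(t) * (1 - elapsed * psi / k_theta)."""
    return theta_err_t * (1.0 - elapsed * psi / k_theta)

def index(theta_err, phi):
    """Scalar form of (23): e = theta_err * phi * theta_err."""
    return phi * theta_err ** 2

# Hypothetical scalar case with Gamma = f_theta, so Phi = Psi = lam.
lam, T_s, theta_err_t = 0.5, 0.4, 1.5
lam3, lam4 = lam ** 3, lam ** 2
k_theta = 2.0 * T_s * lam3 / lam4  # chosen to satisfy the sufficient condition (16)

checks = []
for step in range(1, 11):
    elapsed = step * T_s / 10.0  # s - t, sampled over (0, T_s]
    e_s = index(error_after(theta_err_t, elapsed, k_theta, lam), lam)
    e_t = index(theta_err_t, lam)
    bound = -elapsed * lam4 * theta_err_t ** 2 / k_theta
    checks.append(e_s - e_t <= bound + 1e-12)
```

Every entry of `checks` is true for these values, which is consistent with the lemma; the check is only a sanity test of the algebra in one scalar instance, not a substitute for the proof.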


Now, the asymptotic stability result for the closed-loop system in (15) is stated in the
following theorem.

Theorem 2. Suppose that the assumptions (A1)-(A3) are satisfied, that the sufficient
condition and the hypothesis of Lemma 1 are satisfied, and that the open-loop
optimal control problem in (9)-(11b) is feasible for all $t \geq 0$; then the closed-loop system
in (15), in the absence of disturbance, with the model predictive control in (12) and the
concurrent learning based update law in (13), is asymptotically stable with asymptotic
parameter convergence.

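Proof. The proof follows an argument similar to existing MPC stability proofs, with modifications
made to include the parameter convergence. First, define the function $V(x, \tilde{\theta})$ for the
closed-loop system in (15) as follows:

$$V(x(t), \tilde{\theta}(t)) = J^*(x(t)) + \int_0^t \beta \left\| \tilde{\theta}(\tau) \right\|^2 d\tau. \tag{33}$$

Then, $V(x, \tilde{\theta})$ has the following properties:

• $V(0, 0) = 0$ and $V(x, \tilde{\theta}) > 0$ for $(x, \tilde{\theta}) \neq (0, 0)$,

• $V(x, \tilde{\theta})$ is continuous at $(x, \tilde{\theta}) = (0, 0)$,

• along the trajectory of the closed-loop system starting from any $x_0 \in X$, and for
$0 \leq t_1 \leq t_2 \leq \infty$,

$$V(x(t_2), \tilde{\theta}(t_2)) - V(x(t_1), \tilde{\theta}(t_1)) \leq -\int_{t_1}^{t_2} \left( \|x(\tau)\|_Q^2 + \beta \left\| \tilde{\theta}(\tau) \right\|^2 \right) d\tau. \tag{34}$$

To prove the first property, note that $\tilde{\theta} = 0 \Rightarrow \hat{\theta} = \theta$, which, from (8), implies that
$e(\hat{\theta}) = 0$. Thus, it follows from Lemma A.1 in reference [9] that $J^*(0) = 0$. Consequently,
$V(0, 0) = 0$. Similarly, the second property follows from the continuity of $f(\cdot, \cdot, \cdot)$ over
$\Theta$, and Lemma A.1 in reference [9]. The third property is due to Lemma 1 and $R > 0$.
As a result, using a standard argument (see [25]), it can be shown that the equilibrium
$(x, \tilde{\theta}) = (0, 0)$ is stable, in accordance with Definition 2. That is, for each $\epsilon > 0$,
there exists $\eta(\epsilon) > 0$, such that $\left\| [x(0) \; \tilde{\theta}(0)] \right\| < \eta(\epsilon)$ implies that
$\left\| [x(t) \; \tilde{\theta}(t)] \right\| < \epsilon$ for all $t \geq 0$. Moreover, $V(x(t), \tilde{\theta}(t)) \in \mathcal{L}_\infty$, $\forall t \geq 0$, along the
closed-loop trajectory. Next, it will be shown that there exists $\eta > 0$ such that
$(x(t), \tilde{\theta}) \to (0, 0)$ as $t \to \infty$ for all $\left\| [x(0) \; \tilde{\theta}(0)] \right\| < \eta$. This implies that the
equilibrium $(x, \tilde{\theta}) = (0, 0)$ is asymptotically stable, in accordance with Definition 3.

Starting out with the inequality in (34), it follows by induction that

$$\int_0^\infty \left( \|x(\tau)\|_Q^2 + \beta \left\| \tilde{\theta}(\tau) \right\|^2 \right) d\tau \leq V(x(0), \tilde{\theta}(0)) - V(x(\infty), \tilde{\theta}(\infty)). \tag{35}$$

Since $V(x(\infty), \tilde{\theta}(\infty)) \geq 0$ and $V(x(0), \tilde{\theta}(0)) \in \mathcal{L}_\infty$, it follows that

$$\int_0^\infty \left( \|x(\tau)\|_Q^2 + \beta \left\| \tilde{\theta}(\tau) \right\|^2 \right) d\tau \in \mathcal{L}_\infty, \tag{36}$$

which further implies that $\int_0^\infty \|x(\tau)\|_Q^2 \, d\tau \in \mathcal{L}_\infty$ and $\int_0^\infty \beta \left\| \tilde{\theta}(\tau) \right\|^2 d\tau \in \mathcal{L}_\infty$. Thus,
$x(t), \tilde{\theta} \in \mathcal{L}_2$. Furthermore, $[x(t) \; \tilde{\theta}(t)] \in \mathcal{L}_\infty$, $\mathcal{U}$ compact, and $f(\cdot, \cdot, \cdot)$ continuous
imply that $f(x(t), u(t), \hat{\theta}(t)) \in \mathcal{L}_\infty$ for all $t \in [0, \infty)$. Thus $x(t)$ is uniformly continuous.
Also, computing the derivative of $\tilde{\theta}$ from (28) using first principles yields

$$\dot{\tilde{\theta}}(t) = \lim_{\delta \to 0} \frac{\tilde{\theta}(t + \delta) - \tilde{\theta}(t)}{\delta} = -k_\theta^{-1} \tilde{\theta}(t) \Psi \in \mathcal{L}_\infty. \tag{37}$$

Thus, $\tilde{\theta}(t)$ is also uniformly continuous. Consequently, $\|x(t)\|$ and $\left\| \tilde{\theta}(t) \right\|$ are uniformly
continuous in $t$ on $[0, \infty)$. Thus, it follows from Barbalat's Lemma ([26]) that

$$\|x(t)\| \to 0, \quad \text{and} \quad \left\| \tilde{\theta}(t) \right\| \to 0, \quad \text{as } t \to \infty. \tag{38}$$

□


5 Pseudospectral Implementation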


In this section, the open-loop infinite horizon optimal control problem in (9)-(11b) is
transcribed into an NLP using a pseudospectral method. First, the details of the collocation
are given. Then, the resulting NLP for the optimal control problem is formulated. The
effect of the pseudospectral approximation on the stability of the resulting closed-loop
system is also examined.

The most commonly used sets of collocation points are Legendre-Gauss (LG),
Legendre-Gauss-Radau (LGR), and Legendre-Gauss-Lobatto (LGL) points. They are obtained
from the roots of a Legendre polynomial and/or linear combinations of a Legendre
polynomial and its derivatives. All three sets of points are defined on the domain $[-1, 1]$,
but differ significantly in that the LG points include neither of the endpoints, the LGR
points include one of the endpoints, and the LGL points include both of the endpoints.
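
To make the LGR point set concrete, the sketch below computes it with the standard characterization of the Radau points that include $-1$: they are the roots of $P_{N-1}(x) + P_N(x)$, where $P_k$ is the $k$th Legendre polynomial. The grid-plus-bisection root finder and the function names are illustrative assumptions; a production implementation would more likely use an eigenvalue method or a numerical library.

```python
def legendre(n, x):
    """Evaluate the Legendre polynomial P_n(x) with the three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def lgr_points(n, grid=4000):
    """LGR points: -1 plus the n - 1 interior roots of P_{n-1}(x) + P_n(x)."""
    g = lambda x: legendre(n - 1, x) + legendre(n, x)
    points = [-1.0]
    left = -1.0 + 1e-9  # start just inside the interval to skip the root at -1
    for k in range(1, grid + 1):
        right = -1.0 + 2.0 * k / grid
        if g(left) * g(right) < 0.0:  # a sign change brackets a root
            lo, hi = left, right
            for _ in range(80):  # plain bisection
                mid = 0.5 * (lo + hi)
                if g(lo) * g(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            points.append(0.5 * (lo + hi))
        left = right
    return points

points = lgr_points(3)
```

For $N = 3$ the polynomial factors as $(x + 1)(5x^2 - 2x - 1)/2$, so the two interior points are $(1 \pm \sqrt{6})/5$, which gives a closed-form check of the numerical result. Note that $+1$ is not in the set, in line with the statement that the LGR points include only one endpoint.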

The LGR collocation scheme is used in this paper. The reason is
that the pseudospectral form of the LGR scheme results in a system of
equations that has no loss of information from the integral form (this is due to the special
form of the resulting differentiation matrix) [22]. For the infinite horizon part of the cost
functional in (10), the interval $[-1, 1]$ is mapped into $[t, \infty)$ using the change of variable

$$\tau = \phi(\nu_1), \tag{39}$$

where $\phi$ is a differentiable, strictly monotonic function. Three examples of such functions,
based on the ones given in the literature, are

$$\phi_a(\nu_1) = t + \frac{1 + \nu_1}{1 - \nu_1}, \tag{40}$$

$$\phi_b(\nu_1) = t + \log\left( \frac{2}{1 - \nu_1} \right), \tag{41}$$

$$\phi_c(\nu_1) = t + \log\left( \frac{4}{(1 - \nu_1)^2} \right). \tag{42}$$

For the recorded data (the finite horizon part of the cost functional), the interval $[0, T]$
is mapped into $[-1, 1]$ using the affine transformation

$$\tau_{\mathcal{H}} = \frac{T}{2} (\nu_2 + 1). \tag{43}$$
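
The changes of variable can be written out directly. The sketch below assumes the three infinite-horizon maps take the forms $t + (1 + \nu)/(1 - \nu)$, $t + \log(2/(1 - \nu))$, and $t + \log(4/(1 - \nu)^2)$, and that the recorded-data map is $(T/2)(\nu + 1)$; these forms are reconstructed from a damaged source and should be treated as assumptions. The helper `slope` is a hypothetical finite-difference estimate of $d\phi/d\nu$.

```python
import math

def phi_a(v, t):
    """Rational map from [-1, 1) onto [t, infinity)."""
    return t + (1.0 + v) / (1.0 - v)

def phi_b(v, t):
    """Logarithmic map from [-1, 1) onto [t, infinity)."""
    return t + math.log(2.0 / (1.0 - v))

def phi_c(v, t):
    """Logarithmic map with a squared denominator."""
    return t + math.log(4.0 / (1.0 - v) ** 2)

def slope(phi, v, t, h=1e-6):
    """Central-difference estimate of dphi/dv at v."""
    return (phi(v + h, t) - phi(v - h, t)) / (2.0 * h)

def to_recorded_time(v2, horizon):
    """Affine map from [-1, 1] onto [0, T]."""
    return 0.5 * horizon * (v2 + 1.0)

start_times = [phi(-1.0, 3.0) for phi in (phi_a, phi_b, phi_c)]
```

All three maps send $\nu = -1$ to the current time $t$ and grow without bound as $\nu \to 1$; they differ in how quickly the collocation points spread out along the horizon, which is a tuning choice rather than something fixed by the method.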

Let $S(\nu_1) = d\phi/d\nu_1 = \phi'(\nu_1)$; then the infinite horizon optimal control problem in (9)-(11b)
becomes

$$\min_{(u(\cdot), x(\cdot), \hat{\theta})} J(x(t), u(\cdot), x(\cdot), \hat{\theta}) = \int_{-1}^{+1} S(\nu_1) \left( \|x(\nu_1)\|_Q^2 + \|u(\nu_1)\|_R^2 \right) d\nu_1 + \frac{\gamma T}{2} \int_{-1}^{1} \left\| \dot{x}_{\mathcal{H}}(\nu_2) - \frac{T}{2} f(x_{\mathcal{H}}(\nu_2), u_{\mathcal{H}}(\nu_2), \hat{\theta}) \right\|^2 d\nu_2, \tag{44}$$

subject to

$$\dot{x}(\nu_1) = S(\nu_1) f(x(\nu_1), u(\nu_1), \hat{\theta}), \quad x(-1) = x(t), \tag{45a}$$

$$u(\nu_1) \in \mathcal{U}. \tag{45b}$$

Here, $x(\nu_1)$, $u(\nu_1)$ and $\hat{\theta}(\nu_1)$ denote the state, the control and the parameter estimate as
functions of the new variable $\nu_1$. The independent variable $\nu_2$ denotes the transformed
time variable for the recorded data.

Next, the discrete approximation using the LGR pseudospectral scheme is described.
Consider the LGR collocation points $-1 = \tau_1 < \ldots < \tau_N < +1$, and the additional
non-collocated point $\tau_{N+1} = +1$. The interior collocation points are given by the zeros of the
derivative of the $N$th-order Legendre polynomial $L_{P_N}(x)$, i.e. $\{\tau_i\}_{i=2}^{N} = \{x : L'_{P_N}(x) = 0\}$
[7]. The state is then approximated by a polynomial of degree at most $N$ as follows:

$$x(\nu_1) \approx \sum_{j=1}^{N+1} X_j L_j(\nu_1), \tag{46}$$

$$x_{\mathcal{H}}(\nu_2) \approx \sum_{j=1}^{N+1} X_{\mathcal{H}_j} L_j(\nu_2), \tag{47}$$

$$L_j(\nu) = \prod_{k=1, k \neq j}^{N+1} \frac{\nu - \tau_k}{\tau_j - \tau_k}, \quad j = 1, \ldots, N+1, \tag{48}$$

where $L_j$ is a basis of $N$th-degree Lagrange polynomials. Differentiating the state
approximations in (46) and (47), and evaluating at the collocation points, yields

$$\dot{x}(\tau_i) \approx \sum_{j=1}^{N+1} X_j \dot{L}_j(\tau_i) = \sum_{j=1}^{N+1} D_{ij} X_j = D_i X, \tag{49}$$

$$\dot{x}_{\mathcal{H}}(\tau_i) \approx \sum_{j=1}^{N+1} X_{\mathcal{H}_j} \dot{L}_j(\tau_i) = \sum_{j=1}^{N+1} D_{ij} X_{\mathcal{H}_j} = D_i X_{\mathcal{H}}, \tag{50}$$

where

$$X = \begin{bmatrix} X_1 \\ \vdots \\ X_{N+1} \end{bmatrix} \quad \text{and} \quad X_{\mathcal{H}} = \begin{bmatrix} X_{\mathcal{H}_1} \\ \vdots \\ X_{\mathcal{H}_{N+1}} \end{bmatrix}.$$

The matrix $D \in \mathbb{R}^{N \times (N+1)}$ with entries $D_{ij}$ $(i = 1, \ldots, N; \; j = 1, \ldots, N + 1)$ is the
Radau Pseudospectral Differentiation Matrix, since it transforms the state approximation
at the points $\tau_1, \ldots, \tau_{N+1}$ to the derivatives of the state approximation at the LGR points
$\tau_1, \ldots, \tau_N$. As a result, this formulation avoids the use of any numerical smoothing
techniques, otherwise needed to compute the state derivatives for the recorded data.
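
A small sketch of how such a differentiation matrix can be assembled is given below. It assumes the entries are $D_{ij} = \dot{L}_j(\tau_i)$ for the Lagrange basis built on the $N$ LGR points plus the endpoint $+1$, and it hardcodes the $N = 3$ LGR points $-1$ and $(1 \pm \sqrt{6})/5$. The function names are hypothetical, and the direct product formula used here is chosen for readability rather than numerical robustness at large $N$.

```python
def lagrange_derivative(nodes, j, x):
    """Derivative at x of the Lagrange basis polynomial attached to nodes[j]."""
    total = 0.0
    for m, node_m in enumerate(nodes):
        if m == j:
            continue
        term = 1.0 / (nodes[j] - node_m)
        for k, node_k in enumerate(nodes):
            if k != j and k != m:
                term *= (x - node_k) / (nodes[j] - node_k)
        total += term
    return total

def differentiation_matrix(collocation, nodes):
    """D[i][j] = derivative of L_j at collocation[i]; shape N x (N + 1)."""
    return [[lagrange_derivative(nodes, j, tau) for j in range(len(nodes))]
            for tau in collocation]

# LGR points for N = 3, plus the non-collocated endpoint +1.
lgr = [-1.0, (1.0 - 6.0 ** 0.5) / 5.0, (1.0 + 6.0 ** 0.5) / 5.0]
nodes = lgr + [1.0]
D = differentiation_matrix(lgr, nodes)

# Apply D to samples of x(v) = v**3, which a cubic interpolant reproduces exactly.
samples = [v ** 3 for v in nodes]
derivative = [sum(D[i][j] * samples[j] for j in range(4)) for i in range(3)]
```

Because the interpolant through four nodes is exact for cubics, `derivative` matches $3\tau_i^2$ at each collocation point up to rounding. This is the sense in which the matrix replaces numerical differentiation of the recorded data.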

It is noted that the matrix $X_{\mathcal{H}}$ is composed of the state approximations of the recorded
data at the collocation points only. These are generally unknown, since the recorded data
are assumed to be measured at specific points which are generally not the collocation
points. As a result, a transformation is needed to express $X_{\mathcal{H}}$ in terms of the measured
recorded data $X_{\mathcal{H}}^m \in \mathbb{R}^{N_m \times (N+1)}$, where $N_m$ is the number of measurement points. It is
required that $N_m > N$ to ensure that the corresponding measured data map to a unique
set of $X_{\mathcal{H}}$. Let $\nu_2 = \sigma_1, \ldots, \sigma_{N_m}$ denote the measurement points for the recorded data;
then from (47)

$$x_{\mathcal{H}}(\sigma_k) \approx \sum_{j=1}^{N+1} X_{\mathcal{H}_j} L_j(\sigma_k), \quad k = 1, \ldots, N_m. \tag{51}$$

Thus,

$$X_{\mathcal{H}}^m = M_x X_{\mathcal{H}}, \tag{52}$$

where the matrix $M_x \in \mathbb{R}^{N_m \times (N+1)}$ has entries $M_{x_{kj}} = L_j(\sigma_k)$. Since $N_m \geq (N + 1)$, it
follows from the orthogonality of the Legendre polynomials that $\mathrm{rank}(M_x) = N + 1$. As a
result,

$$X_{\mathcal{H}} = M_x^\dagger X_{\mathcal{H}}^m = (M_x^T M_x)^{-1} M_x^T X_{\mathcal{H}}^m \tag{53}$$

will yield a unique state approximation $X_{\mathcal{H}}$ for every unique measured state data
$X_{\mathcal{H}}^m$. Similarly, the open-loop control signals at the collocation points are given in terms
of the open-loop controls at the measurement points as

$$U_{\mathcal{H}} = M_u^\dagger U_{\mathcal{H}}^m = (M_u^T M_u)^{-1} M_u^T U_{\mathcal{H}}^m, \tag{54}$$

where the matrix $M_u \in \mathbb{R}^{N_m \times N}$ has entries $M_{u_{kj}} = L_j(\sigma_k)$.

Let $U \in \mathbb{R}^{N \times m}$ be a matrix whose $i$th row is an approximation to the control
$u(\tau_i)$, $1 \leq i \leq N$. The discrete approximation to the system dynamics in (45a) is
obtained by evaluating the system dynamics at each collocation point and replacing $\dot{x}(\tau_i)$
by its discrete approximation $D_i X$. Hence, the discrete approximation to the system
dynamics is given by

$$D_i X = S(\tau_i) f(X_i, U_i, \hat{\theta}), \quad 1 \leq i \leq N. \tag{55}$$

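Next, the objective function in (44) is approximated by a Legendre-Gauss quadrature as
follows:

$$J \approx \sum_{i=1}^{N} w_i \left[ S(\tau_i) \left( \|X_i\|_Q^2 + \|U_i\|_R^2 \right) + \frac{\gamma T}{2} \left\| D_i M_x^\dagger X_{\mathcal{H}}^m - \frac{T}{2} f(e_i M_x^\dagger X_{\mathcal{H}}^m, e_i M_u^\dagger U_{\mathcal{H}}^m, \hat{\theta}) \right\|^2 \right], \tag{56}$$

where $e_i$ is the $i$th row of the identity matrix of appropriate dimension and $w_i$ is the
quadrature weight associated with $\tau_i$, given by [23]

$$w_i = \begin{cases} \dfrac{2}{N^2}, & \tau_i = -1, \\[2mm] \dfrac{1 - \tau_i}{N^2 P_{N-1}(\tau_i)^2}, & \text{otherwise}, \end{cases} \tag{57}$$

where $P_{N-1}$ is the $(N-1)$th Legendre polynomial. The continuous-time nonlinear
infinite-horizon optimal control problem in (9)-(11b) is then approximated by the following NLP:

$$\min_{(U, X, \hat{\theta})} \bar{J}(x(t), U, X, \hat{\theta}) = \sum_{i=1}^{N} w_i \left[ S(\tau_i) \left( \|X_i\|_Q^2 + \|U_i\|_R^2 \right) + \frac{\gamma T}{2} \left\| D_i M_x^\dagger X_{\mathcal{H}}^m - \frac{T}{2} f(e_i M_x^\dagger X_{\mathcal{H}}^m, e_i M_u^\dagger U_{\mathcal{H}}^m, \hat{\theta}) \right\|^2 \right] \tag{58}$$

subject to

$$D_i X - S(\tau_i) f(X_i, U_i, \hat{\theta}) = 0, \quad 1 \leq i \leq N, \tag{59a}$$

$$x(t) - X_1 = 0, \tag{59b}$$

$$U_i \in \mathcal{U}, \quad 1 \leq i \leq N, \tag{59c}$$

$$\hat{\theta} \in \Theta. \tag{59d}$$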
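
The quadrature weights that enter the NLP cost can be sanity-checked numerically. The sketch below uses the standard LGR weights, $w_1 = 2/N^2$ and $w_i = (1 - \tau_i) / (N^2 P_{N-1}(\tau_i)^2)$ for the remaining points, with the $N = 3$ LGR points hardcoded; the function names are illustrative assumptions. An $N$-point LGR rule is exact for polynomials up to degree $2N - 2$, which the example tests on $\int_{-1}^{1} x^4 \, dx = 2/5$.

```python
def legendre(n, x):
    """Evaluate the Legendre polynomial P_n(x) with the three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def lgr_weights(points):
    """Standard LGR quadrature weights; points[0] is assumed to be -1."""
    n = len(points)
    weights = [2.0 / n ** 2]
    for tau in points[1:]:
        weights.append((1.0 - tau) / (n ** 2 * legendre(n - 1, tau) ** 2))
    return weights

# LGR points for N = 3.
points = [-1.0, (1.0 - 6.0 ** 0.5) / 5.0, (1.0 + 6.0 ** 0.5) / 5.0]
weights = lgr_weights(points)

# Quadrature of x**4 over [-1, 1]; degree 4 = 2N - 2 is integrated exactly.
integral = sum(w * tau ** 4 for w, tau in zip(weights, points))
```

The weights sum to 2, the length of the interval, and the quartic is integrated to rounding accuracy with only three function evaluations, which is the property the discretized cost relies on.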


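Let $U_i^*$, $1 \leq i \leq N$, and $\hat{\theta}^*$ be the solution of the NLP in (58)-(59d); then the
closed-loop control and parameter update laws in (12) and (13) become

$$u(\tau) = \sum_{j=1}^{N} U_j^* L_j(\phi^{-1}(\tau)), \tag{60}$$

$$\hat{\theta}(\tau) = \hat{\theta}^* + \frac{\tau - t}{k_\theta} \sum_{j=1}^{N} w_j \left( D_j M_x^\dagger X_{\mathcal{H}}^m - \frac{T}{2} f(e_j M_x^\dagger X_{\mathcal{H}}^m, e_j M_u^\dagger U_{\mathcal{H}}^m, \hat{\theta}^*) \right) \Gamma(\tau_j)^T. \tag{61}$$

Also, the PE condition requirement of Lemma 1 reduces to the rank condition

$$\mathrm{rank} \left[ \sum_{j=1}^{N} w_j f_\theta(e_j M_x^\dagger X_{\mathcal{H}}^m, e_j M_u^\dagger U_{\mathcal{H}}^m, \theta_{\mathcal{H}}) \, f_\theta(e_j M_x^\dagger X_{\mathcal{H}}^m, e_j M_u^\dagger U_{\mathcal{H}}^m, \theta_{\mathcal{H}})^T \right] = p, \tag{62}$$

for all $\theta_{\mathcal{H}} \in \Theta$. This is consistent with the original concurrent learning work [11] for the special
case with the LP assumption.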


5.1 Stability Considerations 

Next, the effect of the pseudospectral approximation on the stability of the system is
examined. First, some established results on the properties of pseudospectral
approximations are provided. From these results, the stability of the closed loop system
resulting from the control law in (60) is studied. Similar to Section 4, except where otherwise
required for clarity, the shorthands

$$J(x(t)) \triangleq J(x(t), u, x, \hat{\theta}),$$

$$J^*(x(t)) \triangleq J(x(t), u^*, x^*, \hat{\theta}^*),$$

$$\bar{J}(x(t)) \triangleq \bar{J}(x(t), U, X, \hat{\theta}),$$

$$\bar{J}^*(x(t)) \triangleq \bar{J}(x(t), U^*, X^*, \hat{\theta}^*)$$

are used.

Lemma 2 (Interpolation Error Bounds [7], Section 5.4.3). If $x = [x_1, \ldots, x_n] \in \mathcal{H}^\alpha_n$, with
$x_i \in \mathcal{H}^\alpha$, $i = 1, \ldots, n$, then there exist $X_j = x(\tau_j)$, $j = 1, \ldots, N+1$, and $c_1, c_{1_i}, c_2, c_{2_i} > 0$
such that:

(a) The interpolation error is bounded,

$$\left\| x(\tau) - \sum_{j=1}^{N+1} X_j L_j(\tau) \right\|_2 \leq \sum_{i=1}^{n} \left\| x_i - \sum_{j=1}^{N+1} X_{ij} L_j(\tau) \right\|_2 \leq \sum_{i=1}^{n} c_{1_i} N^{-\alpha} \|x_i\|_{(\alpha)} \leq c_1 N^{-\alpha}. \tag{63}$$

(b) The error between the exact derivative and the derivative of the interpolation is
bounded,

$$\left\| \dot{x}(\tau) - D(\tau) X \right\|_2 \leq \sum_{i=1}^{n} \left\| \dot{x}_i - \sum_{j=1}^{N+1} X_{ij} \dot{L}_j(\tau) \right\|_2 \leq \sum_{i=1}^{n} c_{2_i} N^{1-\alpha} \|x_i\|_{(\alpha)} \leq c_2 N^{1-\alpha}, \tag{64}$$

where $D(\tau) = [\dot{L}_1(\tau), \dot{L}_2(\tau), \ldots, \dot{L}_{N+1}(\tau)]$.


Remark 3. It is straightforward to see, using the orthogonality property of the Lagrange
interpolation polynomial, that the interpolation error is zero at the collocation points.
In other words, the approximation is exact at the interpolation points. As a result,
any feasible point of the optimization problem in (44)-(45b) represents the actual system
dynamics at the collocation points, and the error due to interpolation elsewhere is governed
by Lemma 2.


Lemma 3 (Feasibility, Convergence, and Consistency of pseudospectral approximations).
Let $x^*(t) \in \mathcal{H}^\alpha_n$, $u^*(t) \in \mathcal{H}^\alpha_m$ and $\hat{\theta}^*$ be the solution of the optimal control problem
in (44)-(45b), and $X^*$ and $U^*$ the solution of the corresponding NLP in (58)-(59d); then
the error in the optimal cost functional due to the pseudospectral approximation can be
upper bounded as follows:

$$\left| J(x(t), u^*, x^*, \hat{\theta}^*) - \bar{J}(x(t), U^*, X^*, \hat{\theta}^*) \right| \leq \mu(t) N^{-\alpha}, \tag{65}$$

where $\mu(t) > 0$ is bounded with bounded derivatives.

Theorem 3. Suppose that the assumptions (A1)-(A3) are satisfied, that the sufficient
condition and the hypothesis of Lemma 1 are satisfied, and that the open-loop
optimal control problem in (9)-(11b) is feasible for all $t \geq 0$; then the closed-loop system
in (15), in the absence of disturbance, with the model predictive control in (60) and the
concurrent learning based update law in (61) determined from the solution of the NLP in
(58)-(59d), is uniformly ultimately bounded. Moreover, the ultimate bound can be made
arbitrarily small by the choice of the number of collocation points.

Proof. It has been shown that the feasibility of the open-loop optimal control problem 
in © (lllbl) implies the feasibility of the NLP in (1551) (I59dl) (See HU). Using LemmaH 
the relationship between the value function of the finite-horizon optimal control problem 
in (lTTl) - (l45bl) and the optimal value of the finite-dimensional NLP in (1551) (I59d[) can be 
expressed as 


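$$\bar{J}^*(x(s)) = J^*(x(s)) + \mu_1(s) N^{-\alpha}, \qquad \mu_1(s), \dot{\mu}_1(s) \in \mathcal{L}_\infty, \tag{66}$$

for all $s \in (t, t + T_s]$. Thus, using Lemma 1, it follows that

$$\begin{aligned}
\bar{J}^*(x(s)) &\le J^*(x(t)) - \int_t^s \left( \|x(\tau)\|_Q + \|u^*(\tau)\|_R + 2\beta\,\tilde{\theta}(\tau) \right) d\tau + \mu_1(s) N^{-\alpha} \\
&= \bar{J}^*(x(t)) - \int_t^s \left( \|x(\tau)\|_Q + \|u^*(\tau)\|_R + 2\beta\,\tilde{\theta}(\tau) \right) d\tau + \left( \mu_1(s) + \mu_2(s) \right) N^{-\alpha}, \qquad \mu_2(s), \dot{\mu}_2(s) \in \mathcal{L}_\infty,
\end{aligned} \tag{67}$$

or

$$\bar{J}^*(x(s)) \le \bar{J}^*(x(t)) - \int_t^s \left( \|x(\tau)\|_Q + \|u^*(\tau)\|_R + 2\beta\,\tilde{\theta}(\tau) \right) d\tau + \mu(s) N^{-\alpha}, \qquad \mu(s), \dot{\mu}(s) \in \mathcal{L}_\infty. \tag{68}$$

Similarly to the proof of Theorem 1, define the function

$$V(x(t), \tilde{\theta}(t)) = \bar{J}^*(x(t)) + \int_0^t \beta\,\tilde{\theta}(\tau)\, d\tau. \tag{69}$$

Taking the time derivative of $V(x(t), \tilde{\theta}(t))$ yields

$$\dot{V}(x(t), \tilde{\theta}(t)) = \lim_{s \to t} \frac{V(x(s), \tilde{\theta}(s)) - V(x(t), \tilde{\theta}(t))}{s - t} \tag{70}$$

$$= \lim_{s \to t} \left[ \frac{\bar{J}^*(x(s)) - \bar{J}^*(x(t))}{s - t} + \frac{1}{s - t} \int_t^s \beta\,\tilde{\theta}(\tau)\, d\tau \right], \tag{71}$$

which, after using (68), can be upper bounded as

$$\dot{V}(x(t), \tilde{\theta}(t)) \le -\lim_{s \to t} \frac{1}{s - t} \int_t^s \left( \|x(\tau)\|_Q + \|u^*(\tau)\|_R + \beta\,\tilde{\theta}(\tau) \right) d\tau + \mu(t) N^{-\alpha} \tag{72}$$

$$\le -\lim_{s \to t} \frac{1}{s - t} \int_t^s \left( \|x(\tau)\|_Q + \beta\,\tilde{\theta}(\tau) \right) d\tau + \mu(t) N^{-\alpha}, \tag{73}$$

which simplifies to

$$\begin{aligned}
\dot{V}(x(t), \tilde{\theta}(t)) &\le -\|x(t)\|_Q - \beta\,\tilde{\theta}(t) + \mu(t) N^{-\alpha} \\
&\le -\|x(t)\|_Q - \beta\,\tilde{\theta}(t) + c N^{-\alpha}
\end{aligned} \tag{74}$$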


for some $c > 0$, since $\mu$ is bounded. Thus the state and the parameter estimation error are uniformly ultimately bounded [26]. From (74), it is clear that the ultimate bound can be made arbitrarily small by choosing $N$ appropriately. □
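The role of the $cN^{-\alpha}$ term in (74) can be illustrated with a toy scalar comparison system. The sketch below is not the closed-loop system of the paper: the ODE $\dot{v} = -q v + c N^{-\alpha}$, the constants `q`, `c`, `alpha`, and the function name `ultimate_value` are all assumptions chosen for illustration. It only shows how a constant forcing term of size $cN^{-\alpha}$ leaves a residual of order $cN^{-\alpha}/q$ that shrinks as $N$ grows.

```python
def ultimate_value(n_nodes, q=1.0, c=1.0, alpha=2.0, v0=1.0, dt=1e-3, t_end=20.0):
    """Integrate the toy ODE v' = -q*v + c*N**(-alpha) with forward Euler.

    All constants are hypothetical; they are not taken from the paper.
    Returns v(t_end), which approaches the residual level c*N**(-alpha)/q.
    """
    v = v0
    forcing = c * n_nodes ** (-alpha)
    for _ in range(int(round(t_end / dt))):
        v += dt * (-q * v + forcing)
    return v


if __name__ == "__main__":
    for n in (5, 10, 20):
        # Compare the simulated residual with the closed-form level c*N^(-alpha)/q.
        print(n, ultimate_value(n), 1.0 * n ** (-2.0) / 1.0)
```

In this toy setting the residual level is the steady state of the comparison ODE, so doubling the number of nodes with `alpha = 2` reduces it by a factor of four.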


6 Numerical Example 

The following numerical examples are given to demonstrate the proposed control method. 


6.1 Example 1 

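Consider a system described by the following ODEs:

$$\begin{aligned}
\dot{x}_1 &= \left( \theta_1 + |\theta_2 x_1| \right) x_2 + u, \\
\dot{x}_2 &= \theta_2 x_1.
\end{aligned} \tag{75}$$

Here,

$$f_\theta = \begin{bmatrix} x_2 & 0 \\ \operatorname{sgn}(\theta_2 x_1)\, x_1 x_2 & x_1 \end{bmatrix}, \tag{76}$$

and $\Gamma(\tau)$ is chosen as

$$\Gamma(\tau) = \begin{bmatrix} x_2(\tau) & 0 \\ 0 & x_1(\tau) \end{bmatrix}. \tag{77}$$

4 If $f(t)$ is integrable, then there exists a function $F(t)$ such that $F'(t) = f(t)$. Thus $\lim_{s \to t} \frac{1}{s - t} \int_t^s f(\tau)\, d\tau = \lim_{s \to t} \frac{F(s) - F(t)}{s - t} = F'(t) = f(t)$.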

Thus, the condition in (flTl) is satisfied with 

$$\lambda_1 = \min\left\{ \int_0^T x_{H1}(\tau_H)^2\, d\tau_H,\ \int_0^T x_{H2}(\tau_H)^2\, d\tau_H \right\}, \qquad
\lambda_2 = \max\left\{ \int_0^T x_{H1}(\tau_H)^2\, d\tau_H,\ \int_0^T x_{H2}(\tau_H)^2\, d\tau_H \right\}. \tag{78}$$

The recorded data is generated using the open-loop control

$$u(t) = 0.1\sin(5t) + 0.05\cos(2t), \tag{79}$$

which results in the values $\lambda_1 = 0.0021$ and $\lambda_2 = 0.0155$. The measurement sampling
time is set to $T_s = 0.4$ s. As a result, the optimization routine runs for 0.4 s until the
next measurement is available. Meanwhile, within the interval $\tau \in [t, t + T_s]$, the control
algorithm runs in an open-loop fashion based on (60) and (61), using the present state
estimate and predictions. The inverse learning rate is set to kg = 5T s X^/\4 = 0.0309. 
The number of LGR nodes used is 5, and the size of the recorded data used is N m = 50. 
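The data-recording step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the true parameter values `theta`, the initial state `x0`, the recording horizon `T`, and the fixed-step RK4 integrator are all assumptions, since this section does not state them. For that reason the printed values are not expected to reproduce the reported $\lambda_1$ and $\lambda_2$.

```python
import numpy as np


def dynamics(x, u, theta):
    """Right-hand side of the Example 1 system."""
    x1, x2 = x
    th1, th2 = theta
    return np.array([(th1 + abs(th2 * x1)) * x2 + u, th2 * x1])


def u_recorded(t):
    """Open-loop control used to generate the recorded data."""
    return 0.1 * np.sin(5.0 * t) + 0.05 * np.cos(2.0 * t)


def simulate(theta, x0, T, n_steps):
    """Fixed-step RK4 simulation under the recorded open-loop control."""
    ts = np.linspace(0.0, T, n_steps + 1)
    h = ts[1] - ts[0]
    xs = np.empty((n_steps + 1, 2))
    xs[0] = x0
    for i in range(n_steps):
        t, x = ts[i], xs[i]
        k1 = dynamics(x, u_recorded(t), theta)
        k2 = dynamics(x + 0.5 * h * k1, u_recorded(t + 0.5 * h), theta)
        k3 = dynamics(x + 0.5 * h * k2, u_recorded(t + 0.5 * h), theta)
        k4 = dynamics(x + h * k3, u_recorded(t + h), theta)
        xs[i + 1] = x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return ts, xs


def excitation_bounds(ts, xs):
    """Min and max of the integrals of x_H1^2 and x_H2^2 (trapezoidal rule)."""
    dt = np.diff(ts)
    energies = [np.sum(0.5 * (xs[1:, k] ** 2 + xs[:-1, k] ** 2) * dt) for k in range(2)]
    return min(energies), max(energies)


if __name__ == "__main__":
    # Hypothetical values: theta, x0 and T are NOT given in this section.
    ts, xs = simulate(theta=(-1.0, 1.0), x0=(0.1, 0.1), T=5.0, n_steps=5000)
    lam1, lam2 = excitation_bounds(ts, xs)
    print(lam1, lam2)
```

The two integrals play the role of the lower and upper bounds $\lambda_1$ and $\lambda_2$: a larger $\lambda_1$ means that both recorded state components were sufficiently excited by the open-loop input.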



Figure 1: State trajectory, $T_s = 0.4$ s

Figure 2: Control trajectory, $T_s = 0.4$ s

Figure 3: Parameter estimate trajectory, $T_s = 0.4$ s


Figure 1 shows that the resulting state trajectory converges to the origin asymptotically.
The control authority is shown in Figure 2. The faint vertical lines mark the measurement
points and show how the control is updated at those points. Figure 3 shows that the parameter
estimates converge to the true parameters.



Figure 4: Effect of the number of LGR nodes on parameter estimation (horizontal axis: number of LGR nodes)

As shown in Figure 4, the quality of the parameter estimates improves as the number of
LGR nodes increases. This is because increasing the number of LGR nodes gives a better
approximation of the system dynamics, and consequently the system parameters
are better approximated.
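The effect of the number of nodes on the derivative approximation can be seen in a small stand-alone experiment. The sketch below is an assumption-laden illustration rather than the scheme used in the paper: it collocates only at the Legendre-Gauss-Radau (LGR) points on $[-1, 1)$, builds a Lagrange differentiation matrix from barycentric weights, and uses $\sin$ as a hypothetical test trajectory. The function names `lgr_nodes` and `diff_matrix` are introduced here for illustration.

```python
import numpy as np
from numpy.polynomial import legendre


def lgr_nodes(n):
    """Return the n LGR points: roots of P_{n-1}(x) + P_n(x), including x = -1."""
    coeffs = np.zeros(n + 1)
    coeffs[n - 1] = 1.0
    coeffs[n] = 1.0
    return np.sort(np.real(legendre.legroots(coeffs)))


def diff_matrix(x):
    """Differentiation matrix of the Lagrange interpolant through the nodes x."""
    n = len(x)
    # Barycentric weights w_j = 1 / prod_{k != j} (x_j - x_k).
    w = np.array([1.0 / np.prod(x[j] - np.delete(x, j)) for j in range(n)])
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                D[i, j] = (w[j] / w[i]) / (x[i] - x[j])
        # Each row must sum to zero, because constants have zero derivative.
        D[i, i] = -np.sum(D[i, :])
    return D


def derivative_error(n):
    """Max error of the approximate derivative of sin(x) at the n LGR points."""
    x = lgr_nodes(n)
    return np.max(np.abs(diff_matrix(x) @ np.sin(x) - np.cos(x)))


if __name__ == "__main__":
    for n in (3, 5, 9):
        print(n, derivative_error(n))
```

For a smooth trajectory the error drops rapidly with the number of nodes, which is consistent with the trend in Figure 4, while the dense differentiation matrix grows as $n^2$.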

In order to demonstrate the effect of $T_s$ on the control system, another simulation is
carried out with $T_s = 1$ s. Figures 5 through 7 show the resulting state, control and
parameter estimate trajectories. It is seen that the parameter estimates, and consequently
the control and the system response, converge more slowly with the increased sampling time.


6.2 Example 2 

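This example demonstrates the special case of linearly parametrized systems. The system
considered is a mass-spring-damper system whose dynamics are given by

$$\frac{d}{dt} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_2 \\ -\frac{k}{m} x_1 - \frac{b}{m} x_2 + \frac{1}{m} u \end{bmatrix}, \tag{80}$$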
Figure 5: State trajectory, $T_s = 1$ s

Figure 7: Parameter estimate trajectory, $T_s = 1$ s
where $m$, $k$, $b$ denote the system mass, spring constant, and damping coefficient,
respectively. The dynamics are linearly parametrized as follows:

$$\frac{d}{dt} \begin{bmatrix} x_1 & x_2 \end{bmatrix} = \begin{bmatrix} \theta_1 & \theta_2 & \theta_3 & \theta_4 \end{bmatrix} \begin{bmatrix} x_2 & 0 \\ 0 & -x_1 \\ 0 & -x_2 \\ 0 & u \end{bmatrix}, \tag{81}$$


where the unknown parameters are given by $\theta_1 = 1$, $\theta_2 = k/m$, $\theta_3 = b/m$, $\theta_4 = 1/m$, with
$m = 2$ kg, $k = 5$ N/m, and $b = 0.8$ Ns/m. Two simulations were carried out: one in which the
control is unconstrained, and another in which the constraint |u| < 0.5 is imposed on the
control authority. Figures 8 through 10 show the state trajectories, the control authority and
the parameter updates. As expected, the settling time for the constrained
case is longer than for the unconstrained case. Note that, in this example, the number of
unknown parameters is greater than the number of states.
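As a quick consistency check, the linear parametrization in (81) can be compared numerically with the physical model in (80) for the stated values of $m$, $k$ and $b$. The helper names `regressor`, `f_physical` and `f_parametrized` below are hypothetical and introduced only for this sketch; the random test points are arbitrary.

```python
import numpy as np

# Physical constants stated in the text.
M, K, B = 2.0, 5.0, 0.8

# Parameter vector [theta_1, theta_2, theta_3, theta_4] = [1, k/m, b/m, 1/m].
THETA = np.array([1.0, K / M, B / M, 1.0 / M])


def regressor(x, u):
    """4x2 regressor matrix from (81)."""
    x1, x2 = x
    return np.array([[x2, 0.0], [0.0, -x1], [0.0, -x2], [0.0, u]])


def f_physical(x, u):
    """Mass-spring-damper dynamics written directly from (80)."""
    x1, x2 = x
    return np.array([x2, -(K / M) * x1 - (B / M) * x2 + u / M])


def f_parametrized(x, u, theta=THETA):
    """Same dynamics written as a row vector of parameters times the regressor."""
    return theta @ regressor(x, u)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for _ in range(5):
        x, u = rng.normal(size=2), rng.normal()
        # The two forms should agree up to floating-point rounding.
        print(np.allclose(f_physical(x, u), f_parametrized(x, u)))
```

With the stated constants the parameter vector evaluates to $[1, 2.5, 0.4, 0.5]$, so four parameters are estimated for a two-state system, as noted above.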




Figure 8: State trajectory. (a) Unconstrained (b) Constrained

Figure 9: Control trajectory (horizontal axis: time in seconds). (a) Unconstrained (b) Constrained

Figure 10: Parameter estimate trajectory. (a) Unconstrained (b) Constrained
7 Conclusion


A direct adaptive control technique is presented for use, in conjunction with the concurrent
learning approach, within the framework of model predictive control. The presented
control technique removes the need to switch between an online learning phase and a control
phase by optimizing the control sequences and the parameter estimates simultaneously
at each computation instant. Theoretical analysis shows that the concurrent learning
based adaptive model predictive control system is asymptotically stable with asymptotic
parameter convergence. Numerical simulation results support the theoretical claims
and also show that the parameter estimation error decreases as the number of
LGR nodes increases. However, a larger number of LGR nodes also increases the
computational burden. Therefore, a trade-off is necessary between computational burden
and parameter estimation error.

In future work, the effect of actuator dynamics will be considered. Other discretization
methods will also be considered; candidates include the use of Laguerre
functions and other collocation methods such as Runge-Kutta.